
Add batch span processor benchmarks #3017

Merged (1 commit) on Mar 17, 2021

Conversation

@sbandadd (Contributor) commented Mar 11, 2021

Description:
This PR adds two benchmarks.

  1. The current benchmark executes forceFlush() on every loop iteration, which creates a bottleneck and prevents the batch span processor from being stressed. It also measures only throughput, which is not helpful on its own, since the number of spans actually exported matters as well. BatchSpanProcessorMultiThreadBenchmark is created to address this.
  2. Measuring the CPU usage of the exporter thread is also important, but the current benchmark consumes as much CPU as possible, which makes that measurement meaningless. To maintain a steady state, this PR adds a benchmark that generates 10k spans per second per thread (see the sketch after this description). One would need to attach a profiler such as YourKit or JProfiler to the benchmark run to understand the processor's CPU usage. BatchSpanProcessorCpuBenchmark is created for this purpose.

This PR also fixes a bug in calculating the number of dropped/exported spans: previously, the dropped and exported counters were each counting the other.
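For illustration only (this is not the PR's actual code): a fixed-rate benchmark loop can be built by spacing span creations evenly in time, so a profiler attached to the process sees a steady state rather than a saturated CPU. The sketch below uses hypothetical names (FixedRateSpanGenerator, runSpan); runSpan stands in for starting and ending a span through the processor under test.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.LockSupport;

// Hypothetical sketch: emit work at a fixed rate (~10k ops/sec per thread)
// so exporter-thread CPU usage can be profiled in a steady state.
public final class FixedRateSpanGenerator {
  private static final long SPANS_PER_SECOND = 10_000;
  private static final long PERIOD_NANOS = TimeUnit.SECONDS.toNanos(1) / SPANS_PER_SECOND;

  static void generate(Runnable runSpan, long totalSpans) {
    long nextDeadline = System.nanoTime();
    for (long i = 0; i < totalSpans; i++) {
      runSpan.run(); // stand-in for span start/end through the processor
      nextDeadline += PERIOD_NANOS;
      long sleepNanos = nextDeadline - System.nanoTime();
      if (sleepNanos > 0) {
        LockSupport.parkNanos(sleepNanos); // throttle to the target rate
      }
    }
  }

  public static void main(String[] args) {
    // Demo: ~10k no-op "spans" spread over roughly one second.
    generate(() -> {}, SPANS_PER_SECOND);
  }
}
```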


The review thread below is attached to this excerpt from the diff (truncated in the page capture):

```java
private long getMetric(boolean dropped) {
  String labelValue = String.valueOf(dropped);
  Optional<Long> value =
```
A reviewer (Contributor) commented on the excerpt:

I have a feeling two loops would be very similar code while being more readable.

@sbandadd (Contributor, Author) replied on Mar 12, 2021:

The existing code uses two loops, and it took me a good amount of time to debug why the dropped/exported span metrics were invalid; I got a bit confused about what the two loops were doing. A functional style makes it easy to see what is being filtered and how the data is being mapped.

That said, I would leave this to the maintainers.
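As a minimal sketch of the functional style being proposed, with hypothetical types (Point, ProcessedSpansMetric) standing in for the SDK's metric data: the filter makes visible which label is selected, and the map shows what is extracted.

```java
import java.util.List;
import java.util.Optional;

// Hypothetical stand-in for one labeled point of the processor's
// dropped/exported span counter.
record Point(String droppedLabel, long value) {}

final class ProcessedSpansMetric {
  private final List<Point> points;

  ProcessedSpansMetric(List<Point> points) {
    this.points = points;
  }

  // One stream pipeline instead of two loops: the filter shows which
  // points are selected (dropped=true vs dropped=false) and the map
  // shows what is extracted from them.
  long getMetric(boolean dropped) {
    String labelValue = String.valueOf(dropped);
    Optional<Long> value =
        points.stream()
            .filter(point -> labelValue.equals(point.droppedLabel()))
            .map(Point::value)
            .findFirst();
    return value.orElse(0L);
  }

  public static void main(String[] args) {
    ProcessedSpansMetric metric =
        new ProcessedSpansMetric(List.of(new Point("true", 7), new Point("false", 42)));
    System.out.println(metric.getMetric(true));  // 7 (dropped)
    System.out.println(metric.getMetric(false)); // 42 (exported)
  }
}
```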

@sbandadd force-pushed the sbandadd-add-benchmarks branch 2 times, most recently from d33590f to be485ea on March 12, 2021 21:34
@sbandadd force-pushed the sbandadd-add-benchmarks branch from be485ea to 408915f on March 15, 2021 05:50
@codecov (bot) commented Mar 15, 2021

Codecov Report

Merging #3017 (408915f) into main (0d687bd) will increase coverage by 90.72%.
The diff coverage is n/a.


@@             Coverage Diff             @@
##             main    #3017       +/-   ##
===========================================
+ Coverage        0   90.72%   +90.72%     
- Complexity      0     2813     +2813     
===========================================
  Files           0      324      +324     
  Lines           0     8774     +8774     
  Branches        0      883      +883     
===========================================
+ Hits            0     7960     +7960     
- Misses          0      551      +551     
- Partials        0      263      +263     
Impacted Files Coverage Δ Complexity Δ
...xtension/incubator/trace/data/SpanDataBuilder.java 63.63% <0.00%> (ø) 7.00% <0.00%> (?%)
.../java/io/opentelemetry/sdk/resources/Resource.java 94.44% <0.00%> (ø) 12.00% <0.00%> (?%)
...lemetry/sdk/trace/samplers/ParentBasedSampler.java 100.00% <0.00%> (ø) 21.00% <0.00%> (?%)
...opentelemetry/opencensusshim/OpenTelemetryCtx.java 90.00% <0.00%> (ø) 4.00% <0.00%> (?%)
...lemetry/sdk/common/InstrumentationLibraryInfo.java 100.00% <0.00%> (ø) 4.00% <0.00%> (?%)
...lemetry/sdk/autoconfigure/EnvironmentResource.java 100.00% <0.00%> (ø) 2.00% <0.00%> (?%)
.../sdk/metrics/SynchronousInstrumentAccumulator.java 84.21% <0.00%> (ø) 8.00% <0.00%> (?%)
...y/sdk/autoconfigure/SpanExporterConfiguration.java 72.54% <0.00%> (ø) 13.00% <0.00%> (?%)
...ry/opencensusshim/OpenTelemetryContextManager.java 88.88% <0.00%> (ø) 9.00% <0.00%> (?%)
...ntelemetry/sdk/trace/SdkTracerProviderBuilder.java 89.65% <0.00%> (ø) 10.00% <0.00%> (?%)
... and 314 more

Continue to review the full report at Codecov.

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 0d687bd...408915f.

@anuraaga (Contributor) left a comment:

Having fixed-throughput overhead benchmarks is very helpful, thanks!

@jkwatson (Contributor) commented:

I finally got to spend some dedicated time on this today and got JMH hooked up with the async-profiler profiling option, and I do see some significant overhead introduced by the usage of ArrayBlockingQueue (although it's been tough to get really consistent results).

Thanks for putting in the work on this. Now, to agree on the best solution to the issue. :)
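To make the concern concrete, here is a minimal, hypothetical demonstration (not the processor's code) of why a shared ArrayBlockingQueue can add overhead under multi-threaded span creation: every offer() acquires the queue's single internal lock, so producer threads serialize on it.

```java
import java.util.concurrent.ArrayBlockingQueue;

// Hypothetical demo: two producers contend on the single lock inside
// ArrayBlockingQueue, the same pattern as many threads enqueueing spans.
public final class QueueContentionDemo {
  public static void main(String[] args) throws InterruptedException {
    ArrayBlockingQueue<Object> queue = new ArrayBlockingQueue<>(2048);
    Runnable producer = () -> {
      for (int i = 0; i < 1_000_000; i++) {
        queue.offer(new Object()); // lock acquired per call; drops when full
      }
    };
    Thread t1 = new Thread(producer);
    Thread t2 = new Thread(producer);
    t1.start();
    t2.start();
    t1.join();
    t2.join();
    System.out.println("queued (bounded at capacity): " + queue.size());
  }
}
```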

@anuraaga (Contributor) commented:

@jkwatson approved these benchmarks offline, so I will go ahead and merge. Benchmarks are living code, so I'm sure we'll continue to tweak them, but this is an improvement. Thanks @sbandadd!

@anuraaga anuraaga merged commit 23ce8fe into open-telemetry:main Mar 17, 2021